Clustering by passing messages between data points.
نویسندگان
چکیده
Clustering data by identifying a subset of representative examples is important for processing sensory signals and detecting patterns in data. Such "exemplars" can be found by randomly choosing an initial subset of data points and then iteratively refining it, but this works well only if that initial choice is close to a good solution. We devised a method called "affinity propagation," which takes as input measures of similarity between pairs of data points. Real-valued messages are exchanged between data points until a high-quality set of exemplars and corresponding clusters gradually emerges. We used affinity propagation to cluster images of faces, detect genes in microarray data, identify representative sentences in this manuscript, and identify cities that are efficiently accessed by airline travel. Affinity propagation found clusters with much lower error than other methods, and it did so in less than one-hundredth the amount of time.
منابع مشابه
Replication on Affinity Propagation: Clustering by Passing Messages Between Data Points
In this project, I choose the paper, Clustering by Passing Messages Between Data Points [1], published in Science 2007, as the work to replicate. In [1], a new clustering algorithm named Affinity Propagation is proposed. In my report, three research hypothesizes in this paper that are related to the performance, evaluation and scalability of the algorithm are studied. I implement the algorithm ...
متن کاملAffinity Propagation: Clustering Data by Passing Messages
AFFINITY PROPAGATION: CLUSTERING DATA BY PASSING MESSAGES Delbert Dueck Doctor of Philosophy Graduate Department of Electrical & Computer Engineering University of Toronto 2009 Clustering data by identifying a subset of representative examples is important for detecting patterns in data and in processing sensory signals. Such “exemplars” can be found by randomly choosing an initial subset of da...
متن کاملLocal and global approaches of affinity propagation clustering for large scale data
Recently a new clustering algorithm called ‘affinity propagation’ (AP) has been proposed, which efficiently clustered sparsely related data by passing messages between data points. However, we want to cluster large scale data where the similarities are not sparse in many cases. This paper presents two variants of AP for grouping large scale data with a dense similarity matrix. The local approac...
متن کاملComment on "Clustering by passing messages between data points".
Frey and Dueck (Reports, 16 February 2007, p. 972) described an algorithm termed "affinity propagation" (AP) as a promising alternative to traditional data clustering procedures. We demonstrate that a well-established heuristic for the p-median problem often obtains clustering solutions with lower error than AP and produces these solutions in comparable computation time.
متن کاملUsing Supervised Clustering Technique to Classify Received Messages in 137 Call Center of Tehran City Council
Supervised clustering is a data mining technique that assigns a set of data to predefined classes by analyzing dataset attributes. It is considered as an important technique for information retrieval, management, and mining in information systems. Since customer satisfaction is the main goal of organizations in modern society, to meet the requirements, 137 call center of Tehran city council is ...
متن کاملUsing Supervised Clustering Technique to Classify Received Messages in 137 Call Center of Tehran City Council
Supervised clustering is a data mining technique that assigns a set of data to predefined classes by analyzing dataset attributes. It is considered as an important technique for information retrieval, management, and mining in information systems. Since customer satisfaction is the main goal of organizations in modern society, to meet the requirements, 137 call center of Tehran city council is ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- Science
دوره 315 5814 شماره
صفحات -
تاریخ انتشار 2007